COMPARING METRICS ON ARBITRARY SPACES USING TOPOLOGICAL DATA ANALYSIS

SCOTT BALCHIN AND ETIENNE PILLIN 


Abstract. We use topological data analysis to compare metrics on data sets. We provide two different motivating examples. The first is a point cloud data set that has $\mathbb{R}^2$ as its ambient space, and is therefore very visual; the second deals with a very abstract space which arises through the study of non-transitive dice.

Introduction

Up until now, topological data analysis has mainly been used to study the structure of a data 
set which is endowed with a distance. However, to the authors’ knowledge, these tools have not 
been used to study how the structure of data can change when the metric is varied. This is mainly 
due to the fact that the main application of topological data analysis lies in computer vision, where 
the data is naturally given the Euclidean metric (see [5] for an overview of the current state of the 
theory). It is only when the data in question has no canonical metric or ambient space that one 
would have to compare how a change in metric can change the structure. 

We begin by briefly introducing the main tools of topological data analysis, before applying them
to a point cloud in $\mathbb{R}^2$ which we generate. We then endow this data with three different metrics,
namely the Euclidean, taxicab and supremum metrics. By studying the three different barcodes 
arising, we can see how changing the metric can affect the global structure of the data. We present 
this along with images of how the simplicial complex develops in each metric, giving some visual 
intuition on what the change in metric is doing. 

After this visual example we move on to a more abstract setting. Non-transitive dice are a relatively underdeveloped problem; a handful of people have provided insight into this understandable, yet highly complex setting (for example see [TU], [1], 0). We will introduce the basic requirements to understand the problem, making the example more approachable. There is no canonical metric to put on the space that arises, therefore we suggest a handful and compare the results using topological data analysis.


1. Topological Data Analysis 

1.1. Simplicial Complexes. The general motto of topological data analysis is to give a new way 
to study large data sets. By converting data into a topological space, one can apply algebraic 
topology tools to it in order to infer results. Namely, one usually uses persistent homology to 
construct a so-called barcode which can help identify noisy and persistent features. For us to be 
able to carry out these techniques, we require a distance $d$ on the data such that for all data points $x$, $y$ and $z$ we have:

(1) $d(x,y) \geq 0$

(2) $d(x,y) = d(y,x)$

(3) $d(x,y) + d(y,z) \geq d(x,z)$

Note that with the above distance, we may have a finite number of points which are indistinguishable as they have distance 0; however, the simplicial methods that we will use will encode this. Even though the axioms above only give a pseudometric, most of the distances we consider will be true metrics. We will construct a series of topological spaces from the data set, namely the Rips complexes (see El).

Definition 1.1.

Given a finite collection of points $\{x_\alpha\}$, the Rips complex $\mathcal{R}_\epsilon$ is the abstract simplicial complex whose $k$-simplices correspond to unordered $(k+1)$-tuples of points $\{x_\alpha\}_0^k$ that are pairwise within distance $\epsilon$.

Remark 1.1.

One could also use the Čech complex construction to get a simplicial complex. Computationally speaking, the Rips complexes are simpler to code as they are flag complexes, meaning that their structure is entirely determined by their 1-skeletons.
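The flag property can be illustrated with a short Python sketch. It is a brute-force illustration, not the incremental algorithm described later in the paper; the function name `rips_complex`, the list-of-lists distance matrix and the use of "less than or equal to $\epsilon$" for "within distance $\epsilon$" are assumptions made for the example.

```python
from itertools import combinations

def rips_complex(dist, eps, max_dim):
    """Simplices of the Rips complex at scale eps, grouped by dimension.

    dist is a symmetric matrix (list of lists) of pairwise distances.
    Assumption: "within distance eps" is read as dist <= eps.
    """
    n = len(dist)
    # 1-skeleton data: which pairs of points are joined by an edge.
    close = [[dist[i][j] <= eps for j in range(n)] for i in range(n)]
    simplices = {0: [(i,) for i in range(n)]}
    for k in range(1, max_dim + 1):
        # Flag property: a (k+1)-tuple is a k-simplex iff all its pairs are edges.
        simplices[k] = [
            s for s in combinations(range(n), k + 1)
            if all(close[a][b] for a, b in combinations(s, 2))
        ]
    return simplices

# Four points on the corners of a unit square, Euclidean distances.
r2 = 2 ** 0.5
square = [[0, 1, r2, 1],
          [1, 0, 1, r2],
          [r2, 1, 0, 1],
          [1, r2, 1, 0]]
cx = rips_complex(square, eps=1.0, max_dim=2)
print(len(cx[1]), len(cx[2]))  # four sides, no triangle yet
```

Raising `eps` above the diagonal length adds both diagonals, and every triple of corners then becomes a 2-simplex.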

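As we vary $\epsilon$ through increasing values $\epsilon_0 \leq \epsilon_1 \leq \cdots \leq \epsilon_n$, there is a series of inclusions:

$\mathcal{R}_{\epsilon_0} \subset \mathcal{R}_{\epsilon_1} \subset \cdots \subset \mathcal{R}_{\epsilon_n}$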

Remark 1.2.

Note that if we do have a collection of $k$ points such that they are indistinguishable with respect to our distance measure, then they will be encoded in a $(k-1)$-simplex at $\epsilon = 0$. This justifies our reasoning that we do not need to consider a strict metric on our data set.

1.2. Simplicial and Persistent Homology. A key tool in algebraic topology is that of homology (see [7] for the main source on algebraic topology). In short, the $k$-th homology group of a simplicial complex counts the $k$-dimensional holes. When we have a whole series of complexes, it is interesting to see when new holes form and when old ones are killed off. Persistent homology is the tool that allows us to do just this (see [T] for an overview). We can track individual holes, and then compute at which distance $\epsilon$ a hole is born and at which it dies. We then encode this data into a barcode. We now formally introduce these ideas.

Definition 1.2.

Let $S$ be a simplicial complex. A simplicial $k$-chain is a linear sum of $k$-simplices

$\sum_{i=1}^{N} c_i \sigma_i$

where $c_i \in \mathbb{Z}$ and $\sigma_i \in S$ is the $i$-th $k$-simplex. The group of $k$-chains on $S$ is a free abelian group with basis being all of the $k$-simplices. We denote it $C_k$.

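Definition 1.3.

Given a $k$-simplex $\sigma = [v_0, \ldots, v_k]$ where the $v_i$ are vertices, there is a boundary operator $\partial_k : C_k \to C_{k-1}$ which is defined by

$\partial_k(\sigma) = \sum_{i=0}^{k} (-1)^i [v_0, \ldots, \hat{v}_i, \ldots, v_k]$

where the hat indicates that we delete that vertex.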

Definition 1.4.

The $k$-th homology group of $S$ is defined to be the quotient

$H_k(S) = \ker(\partial_k) / \mathrm{im}(\partial_{k+1})$

Note that this definition is valid as it can be shown that $(C_k, \partial_k)$ is a chain complex. Finally, we define the Betti numbers of $S$ to be

$\beta_k = \mathrm{rank}(H_k(S))$
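As a small illustration of these definitions, the following Python sketch computes Betti numbers from the ranks of the boundary maps, using $\beta_k = \dim C_k - \mathrm{rank}\,\partial_k - \mathrm{rank}\,\partial_{k+1}$. It is only a sketch under stated assumptions: coefficients are taken in $\mathbb{Z}/2\mathbb{Z}$ (so the result matches the integer Betti numbers only when there is no 2-torsion), and the names `rank_gf2` and `betti_numbers` and the dictionary input format are hypothetical.

```python
def rank_gf2(rows):
    """Rank over Z/2Z of a matrix given as a list of rows encoded as int bitmasks."""
    rank = 0
    rows = list(rows)
    while rows:
        pivot = rows.pop()
        if pivot:
            rank += 1
            low = pivot & -pivot  # lowest set bit of the pivot row
            rows = [r ^ pivot if r & low else r for r in rows]
    return rank

def betti_numbers(simplices):
    """simplices: dict dimension -> list of sorted vertex tuples (closed under faces)."""
    top = max(simplices)
    ranks = {0: 0, top + 1: 0}  # the boundary maps below C_0 and above C_top are zero
    for k in range(1, top + 1):
        index = {s: i for i, s in enumerate(simplices[k - 1])}
        rows = []
        for s in simplices[k]:
            # Boundary of s: delete each vertex in turn (signs vanish mod 2).
            mask = 0
            for i in range(len(s)):
                mask |= 1 << index[s[:i] + s[i + 1:]]
            rows.append(mask)
        ranks[k] = rank_gf2(rows)
    return [len(simplices[k]) - ranks[k] - ranks[k + 1] for k in range(top + 1)]

# Hollow triangle: one connected component and one 1-dimensional hole.
hollow = {0: [(0,), (1,), (2,)], 1: [(0, 1), (0, 2), (1, 2)]}
# Filling in the 2-simplex kills the hole.
filled = {**hollow, 2: [(0, 1, 2)]}
print(betti_numbers(hollow), betti_numbers(filled))
```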

Finding the Betti numbers of our simplicial complexes already tells us a lot about the general 
structure of the data. However, what we really need to know is how the homology changes when 
we add new simplices, specifically tracking the behaviour of individual holes. This is achieved by 
using persistent homology. We let $\mathcal{R}$ denote our series of Rips complexes.

Definition 1.5.

For $i < j$, the $(i,j)$-persistent homology of $\mathcal{R}$ is the image of the induced homomorphism in homology $H_*(\mathcal{R}_{\epsilon_i}) \to H_*(\mathcal{R}_{\epsilon_j})$.

We will track the Betti numbers through persistent homology, which will tell us over which $\epsilon$ interval each hole exists. These intervals are what we are interested in. When plotted, they indicate which holes are noise and which ones are actually relevant to the structure of the data. We plot them as a barcode, an example of which is given in Figure 1.


Figure 1. An example of a barcode - The x axis gives the value of e, and the y 
axis gives the homology degree. A bar represents the lifespan of a hole in a given 
homology degree. 

1.3. Comparing Metrics With Topological Data Analysis. The main outlook of this paper is to show that we can use topological data analysis to compare different metrics on a single data space. We will fix a data set $X$, which we will endow with a collection of metrics $\{d_i\}$. We wish to introduce a qualitative measure between metrics. We can do this by performing topological data analysis on $X$ with respect to all of the metrics, then comparing the resulting barcodes.

It should be noted at this point that there are distances between persistence diagrams, namely 
the bottleneck and Wasserstein distance m- This would give us a quantitative measure of the 
difference between the metrics. In the authors’ opinion, this is better suited to calculate the error 
associated to small perturbations of the data rather than a change in metric. 

Instead of using the distances between barcodes, we will give numerical results for each homology dimension under each metric. Once we have these numbers we can compare them among the other metrics to get a feel for how similar they are. The statistics that we will be interested in are the average lifespan of a hole and the number of holes in each dimension.

Definition 1.6.

Suppose that we have a barcode associated to a data set. Without loss of generality, we will assume that $\epsilon$ ranges between 0 and 1 (we can do so by dividing all distances by the maximum distance). Then:

- The expected lifespan of a hole in dimension $n$ is the average length of the $n$-dimensional bars.

- The number of holes in dimension $n$ is the number of bars which are born in the entire span of $\epsilon$.
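These statistics are straightforward to compute once a barcode is available. The sketch below assumes a hypothetical representation of a barcode as a dictionary mapping each dimension to a list of (birth, death) pairs, and it assumes that a bar which never dies is cut off at the largest threshold `eps_max`; neither choice is taken from the paper.

```python
def barcode_statistics(bars, eps_max):
    """bars: dict dimension -> list of (birth, death) pairs; death may be None."""
    stats = {}
    for dim, intervals in bars.items():
        # Rescale to [0, 1]; a bar that never dies is cut off at eps_max (assumption).
        lengths = [((eps_max if d is None else d) - b) / eps_max
                   for b, d in intervals]
        stats[dim] = {
            "count": len(lengths),
            "mean": sum(lengths) / len(lengths) if lengths else 0.0,
            "min": min(lengths, default=0.0),
            "max": max(lengths, default=0.0),
        }
    return stats

# Three bars in dimension 0 (one of them never dies) and one bar in dimension 1.
example = {0: [(0.0, None), (0.0, 1.0), (0.0, 3.0)], 1: [(2.0, 4.0)]}
stats = barcode_statistics(example, eps_max=4.0)
print(stats[0]["count"], stats[1]["mean"])
```

The four columns produced here correspond to the "No. of bars", "Avr. bar size", "Min. bar size" and "Max. bar size" statistics reported later in the paper.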

1.4. Algorithms. In the above sections we have covered the basic theory of topological data analysis; however, this did not give much insight into how one could calculate such invariants of a data set. We represent the distances between the data points as a distance matrix, which by construction is non-negative and symmetric. The global structure of our algorithm is given in Algorithm 1.

We will now describe the steps given in the above structure (more can be found on the theory of computational topology in the book of Edelsbrunner and Harer [2]).

1.4.1. Building the Rips Simplicial Complex. Let $N$ be the upper bound on the dimension of the simplicial complex. We start by building the edges from the list of vertices which are pairwise within a distance below the current threshold $\epsilon$. We then inductively create the list of $(l+1)$-simplices from that of the $l$-simplices, for all $l \in \llbracket 1, N-1 \rrbracket$.

Using an efficient algorithm to do the latter is crucial for the computation to be done within a reasonable amount of time, especially in high dimensions. Let us first consider a naive algorithm which browses the list of $l$-simplices using an $(l+1)$-nested iterator simplex_it. For all combinations of $(l+1)$ $l$-simplices, this iterator checks whether they form a new $(l+1)$-simplex or not. The complexity associated to this step is $O(n_l^{\,l+1})$, where $n_l$ is the number of $l$-simplices. Therefore that of the algorithm is $O\left(\sum_{l=1}^{N-1} n_l^{\,l+1}\right)$.

It is not possible to foresee the number of simplices $n_l$. The worst-case scenario, which occurs when $\epsilon \geq \epsilon_{\max}$, implies that $n_l = \binom{N}{l+1}$. In particular, assuming $N$ is even, we obtain that the $N/2$-th step of the algorithm is of complexity $O\left(\binom{N}{N/2+1}^{N/2+1}\right)$.

Our suggestion for improving this is to avoid browsing the same simplices simultaneously, or the same combinations multiple times. This can be done by ensuring that $\forall i \in \llbracket 1, l \rrbracket$, simplex_it[i] < simplex_it[i+1]. This alone reduces the complexity of the $l$-th step to $O\left(\binom{n_l}{l+1}\right)$. In addition to this, since we build not one simplicial complex but a series of them, we can store the sizes of all the simplex lists at the previous step. We can then make sure that one component of our iterator simplex_it only browses the $l$-simplices which have just been created.
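The saving from the ordering constraint can be checked by simply counting the index tuples that each scheme visits. The following Python sketch does this for a hypothetical list of `n_l` simplices; the function names are illustrative and the code counts candidates only, it does not build any simplices.

```python
from itertools import combinations, product
from math import comb

def naive_candidates(n_l, l):
    """Number of index tuples visited by (l + 1) independent nested loops."""
    return sum(1 for _ in product(range(n_l), repeat=l + 1))

def ordered_candidates(n_l, l):
    """Number of index tuples visited when the indices are strictly increasing."""
    return sum(1 for _ in combinations(range(n_l), l + 1))

n_l, l = 10, 2
print(naive_candidates(n_l, l), ordered_candidates(n_l, l))
```

For ten 2-simplices the naive scheme visits $10^3$ tuples, whereas the ordered scheme visits $\binom{10}{3}$ of them, each unordered combination exactly once.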

Algorithm 1 - Structure of the code

Build the data set E we want to analyze
Calculate the chosen distance matrix over this space
Make an ordered list of the distance thresholds (from lowest to highest)
For all $\epsilon$ in that list,
    Build the Rips simplicial complex
    Update the simplicial homology and barcode
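The first steps of this structure can be sketched in a few lines of Python. This is only an outline under stated assumptions: the helper names `distance_matrix`, `thresholds` and `edge_filtration` are hypothetical, the metric is passed in as a plain function, and only the edges created at each threshold are produced (the higher simplices and the homology update are left out).

```python
def distance_matrix(points, metric):
    """Pairwise distances for a chosen metric, as a list of lists."""
    return [[metric(p, q) for q in points] for p in points]

def thresholds(dist):
    """Sorted distinct pairwise distances: the complex only changes at these values."""
    n = len(dist)
    return sorted({dist[i][j] for i in range(n) for j in range(i + 1, n)})

def edge_filtration(points, metric):
    """Yield (eps, new_edges) for each threshold, from lowest to highest."""
    dist = distance_matrix(points, metric)
    n = len(points)
    for eps in thresholds(dist):
        new_edges = [(i, j) for i in range(n) for j in range(i + 1, n)
                     if dist[i][j] == eps]
        yield eps, new_edges

# Three points on a line, compared with the absolute difference.
pts = [0, 1, 3]
for eps, edges in edge_filtration(pts, lambda p, q: abs(p - q)):
    print(eps, edges)
```

Iterating over the distinct distances, rather than over a fixed grid of $\epsilon$ values, guarantees that no change in the complex is skipped.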





Figure 2. A diagram of the above iterator philosophy 


Finally, out of all those combinations, we only consider those which satisfy the condition

$\forall i \in \llbracket 1, l-1 \rrbracket$, simplex_it[i] shares $i$ $(l-1)$-simplices with $\veebar_{j<i}$ simplex_it[j],

where $\veebar$ denotes an exclusive union.

For that reason, we numerically define $l$-simplices as their list of $(l-1)$-simplices, and not of vertices.

Algorithm 2 - Constructing the Rips Complex

For it[0] in the list of l-simplices of index previous_size to size
    For it[1] in the list of l-simplices of index 1 to size-1 such that it shares 1 (l-1)-simplex with it[0]
        ...
            For it[i] in the list of l-simplices of index i to size-i such that it shares i (l-1)-simplices with simplex_it[j]



The above algorithm implies defining $(l+1)$-times nested loops, for $l \in \llbracket 1, N-1 \rrbracket$. We implemented this by developing static loops using template metaprogramming in C++. This way, our program is capable of running for any dimension $D$ provided at compile time without being penalized for it.

1.4.2. Computing the Barcodes. To compute the persistent homology and the barcodes, we introduce the total boundary matrix $M$, which is a means of assembling all the boundary maps $\partial_i$. It is a square matrix of size $\sum_{l \in \llbracket 0, D \rrbracket} n_l$ with values in $\{0,1\}$. It is defined so that $M_{ij} = 1$ if and only if

- $i$ and $j$ are respectively indices of an $l$-simplex and an $(l+1)$-simplex, for some $l$;

- simplex $i$ belongs to simplex $j$.

Every time an $l$-simplex is created, it is indexed with respect to the total boundary matrix and $(l+1)$ values are added to the matrix. We then perform a reduction method on the matrix. The key point here is that we can take our homology with coefficients in the field $\mathbb{Z}/2\mathbb{Z}$, which removes any torsion groups that may arise. This means that when we are reducing the matrix we can make every entry either a 0 or 1 at every step, which makes the computation much easier.

Another thing to note with the reduction is that we are dealing with sparse matrices. The original total boundary matrix is sparse as it is created in a combinatorial way, and it indeed stays sparse throughout reduction. Therefore we have coded the entire procedure for sparse matrices, making the calculation more efficient.

We perform the reduction as follows. We reduce the matrix $M$ to a matrix $R$ such that the lowest 1s of all non-zero columns appear in different rows. We do this by adding columns (and then reducing mod 2); we are not permitted to swap columns. Once we have the matrix $R$ it is possible to see which holes were killed with the addition of a simplex, and if any holes were born. In particular, adding $\sigma_j$ gives birth to a new homology class if column $j$ of $R$ is zero, and adding $\sigma_j$ kills a homology class if column $j$ is non-zero in $R$. In the latter case, if we denote by $R[i,j]$ its lowest 1, then we kill the homology class that was born when $\sigma_i$ was added (see [3] for more details).

Using this method, we get an $\epsilon_b$ at which a homology class is born and an $\epsilon_d$ at which the same class dies, if at all. We can then plot these into a barcode diagram as shown in Figure 1.
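The reduction described above can be sketched in Python with each column stored as the set of its non-zero row indices, which also respects the sparsity of the matrix. This is the standard left-to-right column reduction over $\mathbb{Z}/2\mathbb{Z}$ written under assumptions: the function name `reduce_boundary_matrix` and the input format are hypothetical, and the simplices are assumed to be indexed in their order of appearance.

```python
def reduce_boundary_matrix(columns):
    """Column reduction over Z/2Z.

    columns[j] is the set of row indices i with M[i][j] = 1.
    Returns (pairs, unpaired): a pair (i, j) means that the class born with
    simplex i dies when simplex j is added; unpaired simplices give classes
    that never die.
    """
    reduced = [set(c) for c in columns]
    owner = {}  # row of a lowest 1 -> index of the column owning it
    pairs = []
    for j, col in enumerate(reduced):
        # Add earlier columns (mod 2) while the lowest 1 clashes with one of them.
        while col and max(col) in owner:
            col ^= reduced[owner[max(col)]]
        if col:
            owner[max(col)] = j
            pairs.append((max(col), j))
    born_and_killed = {i for i, _ in pairs}
    killers = {j for _, j in pairs}
    unpaired = [j for j in range(len(reduced))
                if j not in born_and_killed and j not in killers]
    return pairs, unpaired

# Filled triangle, indexed as: vertices 0, 1, 2; edges 3 = {0,1}, 4 = {1,2},
# 5 = {0,2}; triangle 6 with boundary {3, 4, 5}.
triangle = [set(), set(), set(), {0, 1}, {1, 2}, {0, 2}, {3, 4, 5}]
pairs, unpaired = reduce_boundary_matrix(triangle)
print(pairs, unpaired)
```

In this example the edge with index 5 reduces to a zero column, so it gives birth to a 1-dimensional hole, which is then killed by the triangle; vertex 0 is left unpaired and represents the connected component that persists.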

Note that even with the improved simplex building algorithm described above, the complexity of building high-dimensional simplices is extremely high. We can tackle this by picking a maximum dimension $D \ll N$. An alternative we suggest here is to consider the following stopping criterion.

Definition 1.7 (Connectedness stopping criterion).

We terminate the calculations as soon as the 0th persistent Betti number is equal to one.
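The threshold at which this criterion triggers depends only on the 1-skeleton, so it can be found without building any higher simplices. The sketch below uses a union-find structure over the edges sorted by length; this is a swapped-in technique for illustration rather than the implementation used in the paper, and the name `connectedness_threshold` is hypothetical.

```python
def connectedness_threshold(dist):
    """Smallest eps at which the Rips complex is connected (0th Betti number 1)."""
    n = len(dist)
    parent = list(range(n))

    def find(x):
        # Follow parent pointers to the representative, halving paths on the way.
        while parent[x] != x:
            parent[x] = parent[parent[x]]
            x = parent[x]
        return x

    components = n
    edges = sorted((dist[i][j], i, j) for i in range(n) for j in range(i + 1, n))
    for eps, i, j in edges:
        ri, rj = find(i), find(j)
        if ri != rj:
            parent[ri] = rj  # merge the two components
            components -= 1
            if components == 1:
                return eps
    return 0.0 if n == 1 else None

# Points 0, 1 and 3 on a line: the last merge happens at distance 2.
line = [[0, 1, 3],
        [1, 0, 2],
        [3, 2, 0]]
print(connectedness_threshold(line))
```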

2. Using TDA to Compare Metrics for Low-Dimensional Visual Data 

In this section, we will apply the above philosophy to a data cloud in $\mathbb{R}^2$. We will be considering the following metrics for points $A = (x_1, y_1)$ and $B = (x_2, y_2)$.

(1) Euclidean - $d_1(A,B) = \sqrt{(x_1 - x_2)^2 + (y_1 - y_2)^2}$

(2) Taxicab - $d_2(A,B) = |x_1 - x_2| + |y_1 - y_2|$

(3) Supremum - $d_3(A,B) = \max\{|x_1 - x_2|, |y_1 - y_2|\}$
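For concreteness, the three metrics can be written as small Python functions on pairs of coordinates. The function names are illustrative; the formulas are the ones listed above.

```python
from math import hypot

def euclidean(a, b):
    """d_1: straight-line distance."""
    return hypot(a[0] - b[0], a[1] - b[1])

def taxicab(a, b):
    """d_2: sum of the coordinate differences."""
    return abs(a[0] - b[0]) + abs(a[1] - b[1])

def supremum(a, b):
    """d_3: largest coordinate difference."""
    return max(abs(a[0] - b[0]), abs(a[1] - b[1]))

A, B = (0.0, 0.0), (3.0, 4.0)
print(euclidean(A, B), taxicab(A, B), supremum(A, B))
```

For any pair of points these satisfy $d_3 \leq d_1 \leq d_2$, so at a fixed $\epsilon$ the supremum metric connects at least as many pairs as the Euclidean metric, which in turn connects at least as many as the taxicab metric. Normalising each distance matrix by its maximum, as in Definition 1.6, puts the three barcodes on a common scale.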

We create a data sample of $n$ points in $\mathbb{R}^2$ by uniformly generating them in a bounded subset parameterised by a collection of curves. Specifically for this example, we created 50 points in a region bounded by a circle with four holes, as shown in Figure 3.



Remark 2.1.

As the data lives in $\mathbb{R}^2$, we will not consider simplices of dimension greater than 2 as they will not encode any useful information.

The figures on the following pages show the progression of the simplicial complex (restricted to 
0,1 and 2-simplices) that we get with respect to the different metrics on this space along with the 
resultant barcode of the persistent homology. Note that we have employed the stopping criterion 
here, so that when the space becomes connected we stop. 

We can now perform a visual comparison of the three barcodes. As expected, they are not too different for this example. The persistent $H_0$ Betti numbers are extremely similar; the only differences appear to occur in the 1 and 2-dimensional holes. The taxicab metric has a much steeper increase in the number of 1-dimensional holes, whereas the other two have a more gradual increase, and also seem to have longer lifespans. With respect to $H_2$, there is a hole that is picked up by all of the metrics quite early on and persists throughout.

Table 1 gives the number of holes, the average lifespan of holes and the minimum/maximum lifespans of holes.


Distance     H_k   No. of bars   Avr. bar size   Min. bar size   Max. bar size
Euclidean    0     50            0.53            0.13            1
Euclidean    1     48            0.23            0.03            0.55
Euclidean    2     24            0.21            0.03            0.66
Taxicab      0     50            0.49            0.11            1
Taxicab      1     69            0.18            0.01            0.53
Taxicab      2     50            0.15            0.01            0.66
Supremum     0     50            0.53            0.14            1
Supremum     1     51            0.22            0.01            0.68
Supremum     2     28            0.19            0.01            0.64

Table 1. Results for the different metrics for the considered cloud of data


These figures validate our visual comparisons and also tell us a bit more. The number of bars for the Euclidean and supremum metrics are similar, while the taxicab metric follows a different pattern and has a very large value at $H_1$.

We can see that for all three of the metrics, the average lifespan of a bar decreases as we increase the homology dimension. The Euclidean and supremum metrics are once again extremely similar.

One place where the Euclidean and taxicab metrics are similar is in the maximum bar length. It seems that the longest bar in $H_1$ is detected earlier in the supremum metric. There are much shorter bars in the taxicab and supremum metrics as compared to the Euclidean metric in the higher dimensions. This may be because the data was generated with respect to the Euclidean metric and we should expect less noise because of this.

Figure 4. Results for the Euclidean metric. Panels: (a) 1st step, (b) 20th step, (c) 40th step, (d) 59th step, (e) 79th step, (f) 98th step.

Figure 5. Results for the taxicab metric. Panels: (a) 1st step, (b) 25th step, (c) 48th step, (d) 72nd step, (e) 95th step, (f) 119th step, (g) Barcode.




Figure 6. Results for the supremum metric. Panels: (a) 1st step, (b) 21st step, (c) 41st step, (g) Barcode.

3. Using TDA to Compare Metrics on Abstract Data


3.1. Non-Transitive Dice. In this section, we will give a brief overview of the theory of non-transitive dice. This forms a very abstract data set which we will use as an example for comparing metrics on an abstract space.

Definition 3.1.

An $n$-sided die with values in the set $K = \llbracket 1, k \rrbracket$ is an ordered $n$-tuple $d = [d_1, \ldots, d_n]$ where $d_i \in K$. The collection of all such dice will be denoted $K$-$\mathcal{D}(n)$.

Example 3.1.

The standard 6-sided die would be represented by $[1, 2, 3, 4, 5, 6]$ and is an element of $\llbracket 1,6 \rrbracket$-$\mathcal{D}(6)$.

We will only be considering a very special set of dice, namely those in $\llbracket 1,6 \rrbracket$-$\mathcal{D}(6)$ where the sum of the face values is 21, akin to the example above. We will denote this set of dice as $\mathcal{DT}(6)$. However, everything we do in this article works in the more general case of $K$-$\mathcal{D}(n)$.

Definition 3.2 (Beating relations).

Given two dice $X$ and $Y$ in $K$-$\mathcal{D}(n)$, we say that $Y$ beats $X$ if $P(Y > X) > \frac{1}{2}$. We denote this $Y \gg X$.
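The beating relation can be evaluated exactly by counting winning face pairs. The sketch below assumes that all faces of a die are equally likely and that the threshold for beating is a probability strictly greater than one half; the function names `win_probability` and `beats` are hypothetical.

```python
from fractions import Fraction

def win_probability(y, x):
    """P(Y > X) for independent rolls of dice y and x with equally likely faces."""
    wins = sum(1 for a in y for b in x if a > b)
    return Fraction(wins, len(y) * len(x))

def beats(y, x):
    """Y beats X when P(Y > X) exceeds 1/2 (assumed threshold)."""
    return win_probability(y, x) > Fraction(1, 2)

standard = [1, 2, 3, 4, 5, 6]
other = [2, 2, 2, 5, 5, 5]
print(win_probability(other, standard), beats(other, standard))
```

Here 15 of the 36 face pairs are wins for each die and 6 are draws, so neither of these two dice beats the other.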


Dice endowed with such a beating relation have interesting properties that we wish to study. Namely, there exist cycles of beating relations, which we call non-transitive dice. We formally outline this below.

Definition 3.3 (Cycles of non-transitive dice).

A cycle of length $r$ of non-transitive dice is an ordered collection of dice $(X_1, \ldots, X_r) \in K$-$\mathcal{D}(n)$ such that:

(1) $X_i \gg X_{i+1}$, $\forall\, 1 \leq i \leq r-1$.

(2) $X_r \gg X_1$.


Example 3.2; 

The following diagram 



is a cycle of non-transitive dice in $\mathcal{DT}(6)$ of length 3. These particular dice are called the Grime
dice (see m)- 

Definition 3.4 (Non-transitive dice).

We say that a die $X \in K$-$\mathcal{D}(n)$ is non-transitive if it appears in any non-transitive cycle. The subset of $K$-$\mathcal{D}(n)$ consisting of all non-transitive dice is denoted $K$-$\mathcal{NTD}(n)$.

For our particular example of $\mathcal{DT}(6)$, we will denote the subset consisting of all non-transitive dice by $\mathcal{NTT}(6)$.

Definition 3.5.

Let $X = [x_1, \ldots, x_6]$ be a die in $\mathcal{NTT}(6)$, then its foliation constant is given by

$f(X) = x_1 - 1 + 6 - x_6$

Definition 3.6.

Let $X$ be a die in $\mathcal{NTT}(6)$. We define the symmetry constant of $X$, denoted $s(X)$, by the following equation:

2=1 ' 

Question 3.1.

How can we describe the structure of the space $\mathcal{NTT}(6)$ with respect to the beating relations?

Our main interest will be in answering the above question. We will tackle this approximately using topological data analysis.

3.1.1. $K$-$\mathcal{NTD}(n)$ as a directed graph. We compute the space $\mathcal{NTT}(6)$, and then we represent it as a directed graph. We create a vertex for each die, and then connect node $Y$ to node $X$ with a directed edge if $Y \gg X$. Such a graph gives us lots of information.
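A possible way to carry out this computation is sketched below in Python. It enumerates the sorted 6-sided dice with faces in 1..6 summing to 21, builds the beating graph, and keeps the dice that lie on at least one directed cycle. The threshold of one half for beating, the representation of dice as sorted tuples and all function names are assumptions made for the illustration.

```python
from itertools import combinations_with_replacement

def beats(y, x):
    """Y beats X when P(Y > X) > 1/2 (assumed threshold, equally likely faces)."""
    wins = sum(1 for a in y for b in x if a > b)
    return 2 * wins > len(y) * len(x)

def beating_graph(dice):
    """Adjacency lists with an edge Y -> X whenever Y beats X."""
    return {y: [x for x in dice if beats(y, x)] for y in dice}

def non_transitive(graph):
    """Dice lying on at least one directed cycle of the beating graph."""
    def reachable(start):
        # Nodes reachable from start by a path with at least one edge.
        seen, stack = set(), list(graph[start])
        while stack:
            v = stack.pop()
            if v not in seen:
                seen.add(v)
                stack.extend(graph[v])
        return seen
    return [d for d in graph if d in reachable(d)]

# Sorted 6-sided dice with faces in 1..6 whose faces sum to 21.
dice = [d for d in combinations_with_replacement(range(1, 7), 6) if sum(d) == 21]
graph = beating_graph(dice)
cyclic = non_transitive(graph)
print(len(dice), len(cyclic))
```

A die can never beat itself, since $P(X > X) < \frac{1}{2}$, so a die reachable from itself necessarily sits on a cycle through at least one other die.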

Example 3.3.

Below is the directed graph associated to $\mathcal{NTT}(6)$ along with the associated beating probabilities.



We can read off the cycles, for example 




is a non-transitive cycle. We computationally find the longest cycle to be a 7-cycle given by:

Note that this 7-cycle is not unique: we can replace $[1,1,2,5,6,6]$ by either $[1,1,4,4,5,6]$ or $[2,2,2,5,5,5]$, as seen in the diagram. This occurs as all three of these dice share the same unique source and target, which limits the length of the longest cycle.

3.1.2. Distances on the Non-Transitive Dice Graph. We can endow the above graph with a distance matrix, which will then allow us to apply the topological data analysis tools. Our main metric will be the so-called similarity metric, which will tell us how similar two dice are, such as the three interchangeable dice in the above example.

Definition 3.7.

We define the shortest path metric on $\mathcal{NTT}(6)$. Given two dice $X$ and $Y$ in $\mathcal{NTT}(6)$, we have
$d(X,Y)$ = shortest path from $X$ to $Y$ + shortest path from $Y$ to $X$.

Note that this is well defined as there is always a path from $X$ to $Y$ by construction.
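This distance can be computed with one breadth-first search per node. The Python sketch below assumes that the two shortest path lengths are added, that path length is the number of edges, and that the graph is strongly connected, as stated above; the graph is given as adjacency lists and the function names are hypothetical.

```python
from collections import deque

def path_lengths(graph, source):
    """Breadth-first search: number of edges on a shortest directed path from source."""
    dist = {source: 0}
    queue = deque([source])
    while queue:
        v = queue.popleft()
        for w in graph[v]:
            if w not in dist:
                dist[w] = dist[v] + 1
                queue.append(w)
    return dist

def shortest_path_matrix(graph):
    """d(X, Y) = (shortest path X -> Y) + (shortest path Y -> X).

    Assumes every node can reach every other node; otherwise a KeyError is raised.
    """
    nodes = list(graph)
    out = {v: path_lengths(graph, v) for v in nodes}
    return [[out[x][y] + out[y][x] for y in nodes] for x in nodes]

# A directed 3-cycle A -> B -> C -> A.
cycle = {"A": ["B"], "B": ["C"], "C": ["A"]}
D = shortest_path_matrix(cycle)
print(D)
```

On a directed 3-cycle, going from one node to another and back always traverses the whole cycle, so every off-diagonal distance equals 3.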

Proposition 3.1.

The shortest path distance is indeed a metric on $\mathcal{NTT}(6)$.

Proof 3.1.

- $d(X,Y) \geq 0$ is clear as the length of a shortest path between two nodes in a graph is always non-negative.

- $d(X,Y) = 0$ if and only if $X = Y$ also holds, as if the shortest path between two nodes has length 0 then they are exactly the same node.

- $d(X,Y) = d(Y,X)$ holds by definition.

- $d(X,Z) \leq d(X,Y) + d(Y,Z)$ is true as we can always construct paths between $X$ and $Z$ of total length equal to the right-hand side by concatenating shortest paths through $Y$.

Definition 3.8.

We define the shortest path matrix of $\mathcal{NTT}(6)$ to be the $n \times n$ matrix $D(\mathcal{NTT}(6))$ whose $d_{ij}$ element is exactly $d(X_i, X_j)$. Here $n$ is the cardinality of $\mathcal{NTT}(6)$, namely 10.

Now that we have defined the distance metric, we have a way to compare non-transitive dice. 
What we do next is see how similar two dice are by comparing their distances to all other dice. 

Definition 3.9.

Let $D = D(\mathcal{NTT}(6))$ be the shortest path matrix. We define the $n \times n$ similarity matrix $\tilde{D}$ by defining the elements

$\tilde{D}_{ij} = d_{\mathbb{R}^{n-1}}(D_i, D_j)$.

Here $d_{\mathbb{R}^{n-1}}$ is the Euclidean metric on $\mathbb{R}^{n-1}$, and $D_i$ is the $i$-th column of $D$ with the $i$-th value removed (which will always be zero).
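Read literally, this definition compares truncated columns of the shortest path matrix. The following Python sketch implements that literal reading; the function name `similarity_matrix` is hypothetical, and the way the shortened columns are aligned (entries kept in their original order after the diagonal entry is dropped) is an assumption.

```python
from math import dist as euclid

def similarity_matrix(D):
    """Euclidean distance between columns of D, each with its diagonal entry removed."""
    n = len(D)
    # Column i of D without its i-th entry, which is always zero.
    cols = [[D[r][i] for r in range(n) if r != i] for i in range(n)]
    return [[euclid(cols[i], cols[j]) for j in range(n)] for i in range(n)]

# Shortest path matrix of a directed 3-cycle: all three truncated columns coincide.
D_cycle = [[0, 3, 3], [3, 0, 3], [3, 3, 0]]
S = similarity_matrix(D_cycle)
print(S)
```

For the 3-cycle every entry of the similarity matrix is zero, which reflects that the three nodes play interchangeable roles in the graph.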

Definition 3.10.

Two dice $X_i$ and $X_j$ in $\mathcal{NTT}(6)$ are said to be similar if the corresponding entry $\tilde{D}_{ij}$ of the similarity matrix is 0.


Remark 3.1.

Note that this is no longer a metric space, as it may not even be Hausdorff. Namely, if we have any similar dice, the distance between them will be zero.


3.2. Comparing Metrics on $\mathcal{NTT}(6)$. We previously defined the similarity distance on the space of non-transitive dice. We will now introduce some other metrics and then compare the resulting barcodes.

Given two dice $X = [x_1, \ldots, x_6]$ and $Y = [y_1, \ldots, y_6]$:

(1) (Similarity Distance) $d_1(X,Y) = \tilde{D}_{XY}$, the corresponding entry of the similarity matrix

(2) (Euclidean Metric) $d_2(X,Y) = \sqrt{(x_1 - y_1)^2 + \cdots + (x_6 - y_6)^2}$

(3) (Foliation-Symmetry Distance) $d_3(X,Y) = |(s(X) + f(X)) - (s(Y) + f(Y))|$

Figure 7 shows the barcodes associated to the space $\mathcal{NTT}(6)$ with respect to the above three metrics. As before, we also present some of the relevant statistics that we can obtain from the barcodes in Table 2.


3.3. Interpretation of Results. The first obvious difference between the abstract space and the sample in $\mathbb{R}^2$ is that we have fewer critical $\epsilon$ values in the abstract case. This follows by construction as the data has no noise and fewer nodes. It is also clear that we made the right decision in allowing the dimension of the simplices to be unbounded, as there seems to be data encoded even in the 8 and 9-dimensional components of the Rips complex.

Between the three metrics on the non-transitive dice set, the similarity metric and the Euclidean metric appear to be connected, as their barcodes and statistics look alike. Making this connection precise would be extremely useful, as the Euclidean metric has a much lower computational complexity than the similarity distance.

On the other hand, the foliation-symmetry distance does not seem to be too useful. There are 
not many critical values of e and a large number of holes get born and die simultaneously. There 
is also the issue that at e = 0 there are not many connected components, this is due to the fact 
that the way that metric has been constructed implies that there will be many nodes such that the 
distance between them is zero. 

We now use Table 2 to discuss the problem further. The first thing to notice is how the number of bars increases then decreases as we increase the dimension. There is a peak at the $H_3$ level, suggesting this is where most information about our data lies.

The minimum bar sizes of the foliation-symmetry distance are all equal, along with most of the 
maximum bar sizes being equal. This suggests that this metric is not very good at ascertaining the 
importance of holes, and therefore not suited to helping us describe the shape of our data. 

In conclusion, it does seem that we could use the Euclidean metric on the dice sets to at least 
estimate what sort of shape the directed graph with respect to the similarity metric should look 
like. 

Distance             H_k   No. of bars   Avr. bar size   Min. bar size   Max. bar size
Similarity           0     7             0.51            0.25            1
Similarity           1     26            0.18            0.05            0.5
Similarity           2     58            0.15            0.05            0.32
Similarity           3     72            0.15            0.05            0.32
Similarity           4     56            0.16            0.06            0.32
Similarity           5     28            0.17            0.06            0.21
Similarity           6     8             0.19            0.06            0.21
Similarity           7     1             0.21            0.21            0.21
Euclidean            0     10            0.24            0.09            1
Euclidean            1     30            0.22            0.09            0.55
Euclidean            2     61            0.19            0.09            0.55
Euclidean            3     79            0.18            0.09            0.36
Euclidean            4     62            0.18            0.09            0.36
Euclidean            5     30            0.18            0.18            0.18
Euclidean            6     8             0.18            0.18            0.18
Euclidean            7     1             0.18            0.18            0.18
Foliation-Symmetry   0     5             0.4             0.13            1
Foliation-Symmetry   1     27            0.25            0.13            0.5
Foliation-Symmetry   2     55            0.25            0.13            0.5
Foliation-Symmetry   3     69            0.24            0.13            0.5
Foliation-Symmetry   4     55            0.23            0.13            0.5
Foliation-Symmetry   5     28            0.21            0.13            0.5
Foliation-Symmetry   6     8             0.17            0.13            0.5
Foliation-Symmetry   7     1             0.13            0.13            0.13

Table 2. Results for the different distances on $\mathcal{NTT}(6)$


4. Conclusion 

We have suggested a philosophy that topological data analysis has tools that can be applied to the problem of choosing a good distance on an arbitrary space. Analyzing the results is no easy task; we chose a combination of statistics and visual observations to draw conclusions. We believe that there should be an easier and more accurate way to compare barcodes that tells us things about the structure in each dimension. If such a tool were to be developed, it would most definitely strengthen the theory that has been introduced in this paper.

Our reasoning may also help us justify swapping out the true metric on a space for an approximate one with lower complexity, which would aid in high-end calculations. In our example, we have noted that the Euclidean distance could be adjusted to approximate the similarity distance on $\mathcal{NTT}(6)$. Not only is this of lower complexity, but it has also allowed us to draw conclusions about abstract spaces such as the non-transitive dice sets.

Figure 7. Barcodes of $\mathcal{NTT}(6)$. Panels: (a) Similarity metric, (b) Euclidean metric, (c) Foliation-Symmetry metric.